Degree Distributions of Growing Networks 

The in-degree and out-degree distributions of a growing network model are
determined. The in-degree is the number of incoming links to a given node
(and vice versa for out-degree). The network is built by (i) creation of new
nodes which each immediately attach to a pre-existing node, and (ii) creation
of new links between pre-existing nodes. This process naturally generates
correlated in- and out-degree distributions. When the node and link creation
rates are linear functions of node degree, these distributions exhibit
distinct power-law forms. By tuning the parameters in these rates to
reasonable values, exponents which agree with those of the web graph are
obtained.

The world-wide web (WWW) is a rapidly evolving network which now contains
nearly $10^9$ nodes. Much recent effort has been devoted to characterizing
the underlying directed graph formed by these nodes and their connecting
hyperlinks, the so-called "web" graph [1-4]. In parallel with these
developments, a variety of growing network models have recently been
introduced and studied. These model networks are built by sequentially
adding both nodes and links in a manner which mimics the evolution of real
network systems, with the WWW being the most obvious example.

One fundamental characteristic of any graph is the number of links at a
node, the node degree. The growing network models cited above predict that
the distribution of node degree has a power-law form for growth rules in
which the probability that a newly-created node attaches to a pre-existing
node increases linearly with the degree of the "target" node. This power-law
behavior strongly contrasts with the Poisson degree distribution of the
classical random graphs [15], where links are randomly created between any
pair of pre-existing nodes in the network.

FIG. 1. A node with in-degree i = 4, out-degree j = 5, and total degree 9.



Since web links are directed, the total degree of a node may naturally be
resolved into the in-degree, the number of incoming links to a node, and the
out-degree, the number of outgoing links from a node (Fig. 1). While the
total node degree and its distribution are now reasonably understood, little
is known about the joint distribution of in-degrees and out-degrees, as well
as their correlation. Empirical measurements of the web indicate that
in-degree and out-degree distributions exhibit power-law behaviors with
different exponents. In this Letter, we solve for the joint distribution in
a simple growing network model. We are able to reproduce the observed
in-degree and out-degree distributions of the web as well as find
correlations between in- and out-degrees of each node.

Our model represents an extension of growing network models with node and
link creation [13,14] to incorporate link directionality. The network growth
occurs by two distinct processes (Fig. 2):

(i) With probability p, a new node is introduced and 
it immediately attaches to one of the earlier target 
nodes in the network. The attachment probability 
depends only on the in-degree of the target. 

(ii) With probability q = 1 - p, a new link is created
between already existing nodes. The choices of the 
originating and target nodes depend on the out- 
degree of the originating node and the in-degree of 
the target node. 






FIG. 2. Illustration of the growth processes in the growing 
network model: (i) node creation and immediate attachment, 
and (ii) link creation. In (i) the new node is shaded, while in 
both (i) and (ii) the new link is dashed. 

If only process (i) were allowed, the out-degree of each node would be one
by construction. Process (ii) has been shown to drive a transition in the
network structure. We shall further show that this general model gives a
non-trivial out-degree distribution which is distinct from the in-degree
distribution.

We begin our analysis by determining the average node degree; this can be
done without specification of the attachment and link creation
probabilities. Let $N(t)$ be the total number of nodes in the network, and
let $I(t)$ and $J(t)$ be the total in-degree and out-degree, respectively.
According to the two basic growth processes enumerated above, at each time
step these quantities evolve according to one of the following two
possibilities:



$$
(N, I, J) \to
\begin{cases}
(N+1,\; I+1,\; J+1) & \text{with probability } p,\\
(N,\; I+1,\; J+1) & \text{with probability } q.
\end{cases}
\tag{1}
$$



That is, with probability $p$ a new node and new directed link are created
(Fig. 2), so that the number of nodes and both total degrees increase by
one. Conversely, with probability $q$ a new directed link is created and the
total degrees each increase by one, while the total number of nodes is
unchanged. As a result,



$$
N(t) = pt, \qquad I(t) = J(t) = t,
\tag{2}
$$



from which we immediately conclude that the average in- and out-degrees,
$\mathcal{D}_{\rm in} = I(t)/N(t)$ and $\mathcal{D}_{\rm out} = J(t)/N(t)$,
are both time independent and equal to $1/p$.
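
The bookkeeping of Eq. (1) can be checked with a few lines of Python. This is only an illustrative sketch: the function name `grow_counts`, the empty initial network, and the fixed random seed are assumptions made here, not part of the model specification.

```python
import random

def grow_counts(p, steps, seed=0):
    """Track (N, I, J) under the two update rules of Eq. (1).

    Assumption: the network starts empty; only the totals are tracked,
    not which node each link attaches to.
    """
    rng = random.Random(seed)
    n_nodes, total_in, total_out = 0, 0, 0
    for _ in range(steps):
        if rng.random() < p:
            n_nodes += 1   # process (i): a new node arrives with its link
        # both processes add exactly one directed link
        total_in += 1
        total_out += 1
    return n_nodes, total_in, total_out

p = 0.2
N, I, J = grow_counts(p, steps=100_000)
avg_degree = I / N   # should be close to 1/p for long runs
print(N, I, J, avg_degree)
```

For long runs the ratio `I / N` fluctuates around `1/p`, as expected from Eq. (2).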

To determine the joint degree distributions, we need to specify: (i) the
attachment rate $A(i,j)$, defined as the probability that a newly-introduced
node links to an existing node with $i$ incoming and $j$ outgoing links, and
(ii) the creation rate $C(i_1,j_1|i_2,j_2)$, defined as the probability of
adding a new link from a $(i_1,j_1)$ node to a $(i_2,j_2)$ node. We restrict
the form of these rates to those which we naturally expect to occur in
systems such as the web graph. First, we assume that the attachment rate
depends only on the in-degree of the target node, $A(i,j) = A_i$. We also
assume that the link creation rate depends only on the out-degree of the
node from which it emanates and the in-degree of the target node, that is,
$C(i_1,j_1|i_2,j_2) = C(j_1,i_2)$.

On general grounds, the attachment and creation rates $A_i$ and $C(j,i)$
should be non-decreasing functions of $i$ and $j$. For example, a web-page
designer is more likely to construct hyperlinks to well-known pages rather
than to obscure pages. Similarly, a web page with many outgoing hyperlinks
is more likely to create even more hyperlinks. We have found that the degree
distributions exhibit qualitatively different behaviors depending on whether
the asymptotic dependence of the rates $A_i$ and $C(j,i)$ on both $i$ and
$j$ grows slower than linearly, linearly, or faster than linearly. The first
and last cases lead to either rapidly decaying degree distributions or to
the dominance of a single node; this same behavior was already found for the
total node degree. The most interesting behavior arises for asymptotically
linear rates, and we focus on this class of models in our investigations.

Specifically, we consider the model with attachment 
and creation rates which are shifted linear functions in 
all indices (linear-bilinear rates) 



$$
A_i = i + \lambda, \qquad C(j,i) = (i+\lambda)(j+\mu).
\tag{3}
$$



An intuitively natural feature of this model is that both the attachment and
creation rates have the same dependence on the popularity of the target
node. The parameters $\lambda$ and $\mu$ in the rates of Eq. (3) must obey
the constraints $\lambda > 0$ and $\mu > -1$ to ensure that the
corresponding rates are positive for all permissible values of in- and
out-degrees, $i \ge 0$ and $j \ge 1$.
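
The two growth processes with the rates of Eq. (3) can be simulated directly. The sketch below is a minimal Monte Carlo illustration, not the analysis used in the text: the function name `simulate`, the two-node seed network, the fixed random seed, and the decision to allow self-links in process (ii) are all assumptions made for this example.

```python
import random

def simulate(p, lam, mu, steps, seed=0):
    """Grow a directed network with linear-bilinear rates.

    Assumed seed network (not specified in the text): two nodes linked
    to each other, so every node has out-degree >= 1 from the start.
    Returns the lists of in-degrees and out-degrees.
    """
    rng = random.Random(seed)
    in_deg = [1, 1]
    out_deg = [1, 1]
    for _ in range(steps):
        # target chosen with probability proportional to (i + lam)
        in_weights = [i + lam for i in in_deg]
        target = rng.choices(range(len(in_deg)), weights=in_weights)[0]
        if rng.random() < p:
            # process (i): new node with in-degree 0 and out-degree 1
            in_deg[target] += 1
            in_deg.append(0)
            out_deg.append(1)
        else:
            # process (ii): source chosen proportional to (j + mu);
            # self-links are not excluded in this sketch
            out_weights = [j + mu for j in out_deg]
            source = rng.choices(range(len(out_deg)), weights=out_weights)[0]
            out_deg[source] += 1
            in_deg[target] += 1
    return in_deg, out_deg

steps = 3000
in_deg, out_deg = simulate(p=1 / 7.5, lam=0.75, mu=3.55, steps=steps)
print(len(in_deg), sum(in_deg), sum(out_deg))
```

Each step costs time proportional to the current number of nodes, which is adequate for small illustrations; measuring power-law tails would require far longer runs and a faster sampling scheme.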

As the network grows, the joint degree distribution, $N_{ij}(t)$, defined as
the average number of nodes with $i$ incoming and $j$ outgoing links, builds
up. To solve for $N_{ij}(t)$, we shall use the rate equation approach, which
has recently been applied to simpler versions of growing networks. When the
attachment and creation rates are given by Eq. (3), the degree distribution
$N_{ij}(t)$ evolves according to the rate equations



$$
\frac{dN_{ij}}{dt}
= (p+q)\,\frac{(i-1+\lambda)N_{i-1,j} - (i+\lambda)N_{ij}}{I + \lambda N}
+ q\,\frac{(j-1+\mu)N_{i,j-1} - (j+\mu)N_{ij}}{J + \mu N}
+ p\,\delta_{i0}\,\delta_{j1}.
\tag{4}
$$

The first group of terms on the right-hand side accounts for the changes in
the in-degree of target nodes. These changes arise by simultaneous creation
of a new node and link (with probability $p$) or by creation of a new link
only (with probability $q$). For example, the creation of a link to a node
with in-degree $i$ leads to a loss in the number of such nodes. This occurs
with rate $(p+q)(i+\lambda)N_{ij}$, divided by the appropriate normalization
factor $\sum_{i,j}(i+\lambda)N_{ij} = I + \lambda N$. The factor $p+q = 1$
in Eq. (4) has been written to make explicit the two types of relevant
processes. Similarly, the second group of terms accounts for changes in the
out-degree. These occur due to the creation of new links between already
existing nodes, hence the prefactor $q$. The last term accounts for the
continuous introduction of new nodes with no incoming links and one outgoing
link. As a useful self-consistency check, we can easily verify that the
total number of nodes, $N = \sum_{i,j} N_{ij}$, obeys $\dot N = p$, in
agreement with Eq. (2). In the same spirit, the total in- and out-degrees,
$I = \sum_{i,j} i N_{ij}$ and $J = \sum_{i,j} j N_{ij}$, obey
$\dot I = \dot J = 1$.

By solving the first few of Eqs. (4), it is clear that the $N_{ij}$ grow
linearly with time. Accordingly, we substitute $N_{ij}(t) = t\,n_{ij}$, as
well as $N = pt$ and $I = J = t$, into Eqs. (4) to yield a recursion
relation for $n_{ij}$. Using the shorthand notations,

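$$
a = q\,\frac{1+p\lambda}{1+p\mu} \qquad\text{and}\qquad b = 1 + (1+p)\lambda,
$$

the recursion relation for $n_{ij}$ simplifies to

$$
[\,i + a(j+\mu) + b\,]\,n_{ij}
= (i-1+\lambda)\,n_{i-1,j}
+ a(j-1+\mu)\,n_{i,j-1}
+ p(1+p\lambda)\,\delta_{i0}\,\delta_{j1}.
\tag{5}
$$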
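
Because each $n_{ij}$ in Eq. (5) depends only on its neighbors with smaller $i$ or $j$, the recursion can be filled in numerically on a finite grid. The following is a minimal sketch; the function name `joint_density` and the grid sizes are choices made here for illustration, and the parameter values are simply an example.

```python
def joint_density(p, lam, mu, imax, jmax):
    """Solve the recursion (5) for n_ij on 0 <= i <= imax, 1 <= j <= jmax.

    n[i][0] is kept at zero because every node has out-degree >= 1.
    """
    q = 1.0 - p
    a = q * (1 + p * lam) / (1 + p * mu)
    b = 1 + (1 + p) * lam
    n = [[0.0] * (jmax + 1) for _ in range(imax + 1)]
    for i in range(imax + 1):
        for j in range(1, jmax + 1):
            rhs = a * (j - 1 + mu) * n[i][j - 1]
            if i > 0:
                rhs += (i - 1 + lam) * n[i - 1][j]
            if i == 0 and j == 1:
                rhs += p * (1 + p * lam)   # source term: newly created nodes
            n[i][j] = rhs / (i + a * (j + mu) + b)
    return n

p, lam, mu = 1 / 7.5, 0.75, 3.55          # example parameter values
a = (1 - p) * (1 + p * lam) / (1 + p * mu)
b = 1 + (1 + p) * lam
n = joint_density(p, lam, mu, imax=5, jmax=2000)
density_zero_in = sum(n[0])               # approximates the density of nodes with i = 0
print(n[0][1], density_zero_in)
```

Summing the row `n[0]` over `j` should approach $p(1+p\lambda)/b$, the value obtained by summing Eq. (5) over $j$ at $i = 0$, up to a small truncation error from the finite `jmax`.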



We first consider the in-degree and out-degree distributions,
$\mathcal{I}_i(t) = \sum_j N_{ij}(t)$ and $\mathcal{O}_j(t) = \sum_i N_{ij}(t)$.
Because of the linear time dependence of the node degrees, we write
$\mathcal{I}_i(t) = t\,I_i$ and $\mathcal{O}_j(t) = t\,O_j$. The densities
$I_i$ and $O_j$ satisfy

$$
(i + b)\,I_i = (i-1+\lambda)\,I_{i-1} + p(1+p\lambda)\,\delta_{i0},
\tag{6}
$$

$$
\left(j + \frac{1}{q} + \frac{\mu}{q}\right) O_j
= (j-1+\mu)\,O_{j-1} + p\,\frac{1+p\mu}{q}\,\delta_{j1},
\tag{7}
$$
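
As a short worked step, Eq. (6) can be recovered by summing the recursion (5) over $j \ge 1$. The sketch below assumes that the sums converge and uses $n_{i0} = 0$ (every node has at least one outgoing link):

```latex
\begin{align*}
\sum_{j\ge 1}\bigl[i + a(j+\mu) + b\bigr]\,n_{ij}
  &= (i-1+\lambda)\sum_{j\ge 1} n_{i-1,j}
   + a\sum_{j\ge 1}(j-1+\mu)\,n_{i,j-1}
   + p(1+p\lambda)\,\delta_{i0},\\[4pt]
a\sum_{j\ge 1}(j-1+\mu)\,n_{i,j-1}
  &= a\sum_{j\ge 1}(j+\mu)\,n_{ij}
  \qquad\text{(shift } j \to j+1 \text{, using } n_{i0}=0\text{)},\\[4pt]
(i+b)\,I_i &= (i-1+\lambda)\,I_{i-1} + p(1+p\lambda)\,\delta_{i0}.
\end{align*}
```

The terms proportional to $a$ cancel between the two sides, which is why the in-degree recursion does not involve $\mu$. Summing over $i$ instead and dividing by $a$ gives Eq. (7) in the same way.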

respectively. The solution to these recursion formulae may be expressed in
terms of the following ratios of gamma functions:

$$
I_i = I_0\,\frac{\Gamma(i+\lambda)\,\Gamma(b+1)}{\Gamma(i+b+1)\,\Gamma(\lambda)},
\tag{8}
$$

with $I_0 = p(1+p\lambda)/b$, and

$$
O_j = O_1\,\frac{\Gamma(j+\mu)\,\Gamma(2+q^{-1}+\mu q^{-1})}
{\Gamma(j+1+q^{-1}+\mu q^{-1})\,\Gamma(1+\mu)},
\tag{9}
$$

with $O_1 = p(1+p\mu)/(1+q+\mu)$.

From the asymptotics of the gamma function, the in- and out-degree
distributions have the power-law forms

$$
I_i \sim i^{-\nu_{\rm in}}, \qquad \nu_{\rm in} = 2 + p\lambda,
\tag{10}
$$

$$
O_j \sim j^{-\nu_{\rm out}}, \qquad \nu_{\rm out} = 1 + q^{-1} + \mu p q^{-1}.
\tag{11}
$$

These exponents for the degree distributions constitute one of our primary
results. Note that $\nu_{\rm in}$ depends on $\lambda$ (an in-degree
feature) while $\nu_{\rm out}$ depends on $\mu$ (an out-degree feature).
Notice also that both exponents are greater than 2.
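
The exponents in Eqs. (10) and (11) can be checked numerically from the gamma-function solutions (8) and (9). The sketch below evaluates the densities through `math.lgamma` to avoid overflow and estimates the local slope on a log-log scale; the helper names and the parameter values are illustrative assumptions.

```python
from math import lgamma, exp, log

def in_degree_density(i, p, lam):
    """I_i from Eq. (8)."""
    b = 1 + (1 + p) * lam
    i0 = p * (1 + p * lam) / b
    log_ratio = lgamma(i + lam) + lgamma(b + 1) - lgamma(i + b + 1) - lgamma(lam)
    return i0 * exp(log_ratio)

def out_degree_density(j, p, mu):
    """O_j from Eq. (9); c stands for 1/q + mu/q."""
    q = 1 - p
    c = (1 + mu) / q
    o1 = p * (1 + p * mu) / (1 + q + mu)
    log_ratio = lgamma(j + mu) + lgamma(2 + c) - lgamma(j + 1 + c) - lgamma(1 + mu)
    return o1 * exp(log_ratio)

def local_exponent(f, k):
    """Estimate nu in f(k) ~ k**(-nu) from the values at k and 2k."""
    return -log(f(2 * k) / f(k)) / log(2)

p, lam, mu = 1 / 7.5, 0.75, 3.55   # example parameter values
q = 1 - p
est_in = local_exponent(lambda k: in_degree_density(k, p, lam), 10**5)
est_out = local_exponent(lambda k: out_degree_density(k, p, mu), 10**5)
print(est_in, 2 + p * lam)
print(est_out, 1 + 1 / q + mu * p / q)
```

At large degree the estimated slopes should agree with $\nu_{\rm in} = 2 + p\lambda$ and $\nu_{\rm out} = 1 + q^{-1} + \mu p q^{-1}$ up to corrections that vanish as the degree grows.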

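We can also solve the recursion relation (5) for $n_{ij}$ when $i$ or $j$ is
small. For example, we can express $n_{i1}$ as the ratio of two gamma
functions. Then we can express $n_{i2}$ as the sum of two such ratios, etc.
While there appears to be no simple general expression for the joint
distribution, we can extract the limiting behaviors of $n_{ij}$ when $i$ or
$j$ is large. We find

$$
n_{ij} \sim
\begin{cases}
i^{-\xi_{\rm in}}\, j^{\mu}, & 1 \ll j \ll i,\\
j^{-\xi_{\rm out}}\, i^{\lambda-1}, & 1 \ll i \ll j,
\end{cases}
\tag{12}
$$

with

$$
\xi_{\rm in} = \nu_{\rm in}
+ \frac{q}{p}\,\frac{(\nu_{\rm in}-1)(\nu_{\rm out}-2)}{\nu_{\rm out}-1},
\qquad
\xi_{\rm out} = \nu_{\rm out}
+ \frac{1}{p}\,\frac{(\nu_{\rm out}-1)(\nu_{\rm in}-2)}{\nu_{\rm in}-1}.
\tag{13}
$$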

Thus the in- and out-degrees of a node are correlated; otherwise, we would
have $n_{ij} = I_i O_j \sim i^{-\nu_{\rm in}} j^{-\nu_{\rm out}}$. This
correlation between node degrees is our second basic result.

The analytical form of the joint distribution greatly simplifies when
$\nu_{\rm in} = \nu_{\rm out}$, corresponding to $a = 1$ and
$\mu + b = 2\lambda$. In this region of the parameter space, the recursion
relation (5) reduces to

$$
(i + j + 2\lambda)\,n_{ij}
= (i-1+\lambda)\,n_{i-1,j}
+ (j-1+\mu)\,n_{i,j-1}
+ p(1+p\lambda)\,\delta_{i0}\,\delta_{j1}.
\tag{14}
$$



Equation (14) is simpler than the general recursion (5) since the node
degrees $i$ and $j$ now appear with equal prefactors. This feature allows us
to transform Eq. (14) into a constant-coefficient recursion relation.
Indeed, the substitution

$$
n_{ij} = \frac{\Gamma(i+\lambda)\,\Gamma(j+\mu)}{\Gamma(i+j+2\lambda+1)}\,m_{ij}
\tag{15}
$$

reduces Eq. (14) to

$$
m_{ij} = m_{i-1,j} + m_{i,j-1} + \gamma\,\delta_{i0}\,\delta_{j1},
\tag{16}
$$



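with $\gamma = p(1+p\lambda)\,\Gamma(1+2\lambda)/(\Gamma(\lambda)\,\Gamma(\mu+1))$.
We solve Eq. (16) by the generating function technique. Multiplying Eq. (16)
by $x^i y^j$ and summing over all $i \ge 0$, $j \ge 1$ yields

$$
\sum_{i=0}^{\infty}\sum_{j=1}^{\infty} m_{ij}\,x^i y^j = \frac{\gamma y}{1 - x - y}.
\tag{17}
$$

Expanding this latter expression we obtain

$$
m_{ij} = \gamma\,\frac{\Gamma(i+j)}{\Gamma(i+1)\,\Gamma(j)}.
\tag{18}
$$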
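
The expansion leading to Eq. (18) can be checked exactly with integer arithmetic. The sketch below iterates the constant-coefficient recursion (16) with $\gamma$ set to 1 (an arbitrary normalization chosen here) and compares the result with the binomial coefficient $\binom{i+j-1}{i} = \Gamma(i+j)/[\Gamma(i+1)\Gamma(j)]$; the function name `m_table` is illustrative.

```python
from math import comb

def m_table(imax, jmax):
    """Iterate m_ij = m_{i-1,j} + m_{i,j-1} + delta_{i0} delta_{j1} with gamma = 1.

    m[i][0] stays zero because out-degrees start at j = 1.
    """
    m = [[0] * (jmax + 1) for _ in range(imax + 1)]
    for i in range(imax + 1):
        for j in range(1, jmax + 1):
            value = m[i][j - 1]
            if i > 0:
                value += m[i - 1][j]
            if (i, j) == (0, 1):
                value += 1          # source term
            m[i][j] = value
    return m

m = m_table(8, 8)
matches = all(
    m[i][j] == comb(i + j - 1, i)
    for i in range(9)
    for j in range(1, 9)
)
print(matches)
```

The recursion is the Pascal-triangle rule on a shifted lattice, which is why binomial coefficients appear.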



Combining Eqs. (15) and (18) gives the joint in- and out-degree distribution

$$
n_{ij} = \gamma\,
\frac{\Gamma(i+\lambda)\,\Gamma(j+\mu)\,\Gamma(i+j)}
{\Gamma(i+1)\,\Gamma(j)\,\Gamma(i+j+2\lambda+1)}.
\tag{19}
$$
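
As a consistency check, the closed form (19) can be substituted back into the recursion (14). In the sketch below the relation $\mu = q\lambda - 1$ is used; it follows from $\mu + b = 2\lambda$ with $b = 1 + (1+p)\lambda$, and is stated here as a derived assumption of the example rather than a formula quoted from the text. The function names and parameter values are illustrative.

```python
from math import lgamma, exp

def closed_form(i, j, p, lam):
    """n_ij from Eq. (19) in the special case nu_in = nu_out."""
    mu = (1 - p) * lam - 1      # from mu + b = 2*lam with b = 1 + (1 + p)*lam
    gamma_const = p * (1 + p * lam) * exp(
        lgamma(1 + 2 * lam) - lgamma(lam) - lgamma(mu + 1)
    )
    log_n = (lgamma(i + lam) + lgamma(j + mu) + lgamma(i + j)
             - lgamma(i + 1) - lgamma(j) - lgamma(i + j + 2 * lam + 1))
    return gamma_const * exp(log_n)

def residual(i, j, p, lam):
    """Relative mismatch between the two sides of Eq. (14)."""
    mu = (1 - p) * lam - 1
    lhs = (i + j + 2 * lam) * closed_form(i, j, p, lam)
    rhs = 0.0
    if i > 0:
        rhs += (i - 1 + lam) * closed_form(i - 1, j, p, lam)
    if j > 1:
        rhs += (j - 1 + mu) * closed_form(i, j - 1, p, lam)
    if (i, j) == (0, 1):
        rhs += p * (1 + p * lam)
    return abs(lhs - rhs) / lhs

p, lam = 0.4, 1.5               # example values; here mu = -0.1 > -1
worst = max(residual(i, j, p, lam) for i in range(6) for j in range(1, 7))
print(worst)
```

The residual should be at the level of floating-point rounding for every lattice point tested.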



In analogy to Eq. (12), this joint distribution reduces to

$$
n_{ij} \to \gamma\,\frac{i^{\lambda-1}\,j^{\mu}}{(i+j)^{2\lambda+1}}
\tag{20}
$$

in the limit $i \to \infty$ and $j \to \infty$.

Another manifestation of the correlation in the degree distribution becomes
evident by fixing the in-degree $i$ and allowing the out-degree $j$ to vary.
We find that $n_{ij}$ reaches a maximum value when
$j \approx i\mu/[2 + (1+p)\lambda]$ (here we consider large $i$ and assume
that $\mu > 0$). Correspondingly, the average out-degree always scales
linearly with the in-degree, $\langle j \rangle = i(\mu+1)/[(1+p)\lambda]$
(here the coefficient is always positive). Thus popular nodes, those with
large in-degree, also tend to have large out-degrees. A dual property also
holds: nodes with large out-degree, those where many links originate, also
tend to be popular.
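
The linear growth of the average out-degree with the in-degree can be checked numerically from Eq. (19). The sketch below sums over $j$ at fixed large $i$, dropping prefactors that do not depend on $j$ since they cancel in the average; the function name, the cutoff `jmax`, and the parameter values are assumptions of this example, and $\mu = q\lambda - 1$ is the relation implied by $\mu + b = 2\lambda$.

```python
from math import lgamma, exp

def mean_out_degree(i, p, lam, jmax):
    """Average j at fixed in-degree i, using the j-dependence of Eq. (19)."""
    mu = (1 - p) * lam - 1      # special case nu_in = nu_out
    total = 0.0
    weighted = 0.0
    for j in range(1, jmax + 1):
        # factors independent of j are omitted; they cancel in the ratio
        w = exp(lgamma(j + mu) + lgamma(i + j)
                - lgamma(j) - lgamma(i + j + 2 * lam + 1))
        total += w
        weighted += j * w
    return weighted / total

p, lam = 0.5, 3.0               # example values; here mu = 0.5
mu = (1 - p) * lam - 1
i = 1000
ratio = mean_out_degree(i, p, lam, jmax=200_000) / i
predicted = (mu + 1) / ((1 + p) * lam)
print(ratio, predicted)
```

The measured ratio approaches the predicted coefficient $(\mu+1)/[(1+p)\lambda]$ with corrections of relative order $1/i$.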

Let us now compare our predictions with empirical observations for the
world-wide web. The relevant results for the node degrees are

$$
\nu_{\rm in} \approx 2.1, \qquad \nu_{\rm out} \approx 2.7, \qquad
\mathcal{D}_{\rm in} = \mathcal{D}_{\rm out} \approx 7.5.
\tag{21}
$$

Setting the observed value $\mathcal{D}_{\rm in} = \mathcal{D}_{\rm out} = 7.5$
to $p^{-1}$ (see the discussion following Eq. (2)), we see that the
predictions (10) and (11) match the observed values of the in- and
out-degree exponents when $\lambda = 0.75$ and $\mu = 3.55$, respectively.
With these parameter values we also have $\xi_{\rm in} \approx 5.0$ and
$\xi_{\rm out} \approx 3.9$. Empirical measurements of these exponents would
provide a definitive test of our model.
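
The parameter values quoted above follow from inverting Eqs. (10) and (11). A minimal sketch of that arithmetic is given below; the function names are illustrative, and the expressions for $\xi_{\rm in}$ and $\xi_{\rm out}$ are the ones that reproduce the quoted values 5.0 and 3.9.

```python
def fit_parameters(nu_in, nu_out, avg_degree):
    """Invert nu_in = 2 + p*lam and nu_out = 1 + 1/q + mu*p/q, with p = 1/avg_degree."""
    p = 1.0 / avg_degree
    q = 1.0 - p
    lam = (nu_in - 2.0) / p
    mu = (q * (nu_out - 1.0) - 1.0) / p
    return p, q, lam, mu

def joint_exponents(nu_in, nu_out, p):
    """Exponents xi_in and xi_out of the joint distribution."""
    q = 1.0 - p
    xi_in = nu_in + (q / p) * (nu_in - 1) * (nu_out - 2) / (nu_out - 1)
    xi_out = nu_out + (1 / p) * (nu_out - 1) * (nu_in - 2) / (nu_in - 1)
    return xi_in, xi_out

p, q, lam, mu = fit_parameters(nu_in=2.1, nu_out=2.7, avg_degree=7.5)
xi_in, xi_out = joint_exponents(2.1, 2.7, p)
print(lam, mu, xi_in, xi_out)
```

Both fitted parameters satisfy the constraints $\lambda > 0$ and $\mu > -1$ required for positive rates.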

We have also investigated a simplified model with node creation rate
$A_i = i + \lambda$, as above, but with link creation rate
$C(j,i) = j + \mu$, which does not depend on the popularity of the target
node $i$ (linear-linear rates). For this model, the rate equations for the
evolution of the number of nodes with degrees $(i,j)$ have a similar
structure to Eqs. (4) and they can be solved by the same approach as that
given for the network with linear-bilinear growth rates. We find that the
in- and out-degree distributions again have power-law forms. Moreover, the
out-degree exponent is still given by Eq. (11), while the value of the
in-degree exponent is now $\nu_{\rm in} = 1 + \lambda + p^{-1}$. If we set
$p^{-1} = 7.5$ to reproduce the correct average degree of the web graph, we
see that $\nu_{\rm in}$ must be larger than 8.5. Similarly, the
linear-linear model with $A_i = i + \lambda$ and $C(j,i) = i + \mu$ gives a
power-law in-degree distribution but the exponential out-degree distribution
$O_j = p^2 q^{j-1}$. Therefore linear-linear rate models cannot match
empirical observations from the web.

Parenthetically, we can also solve completely the growing network with both
constant node creation rate and constant link creation rate, $A_i = 1$ and
$C(j,i) = 1$. While not necessarily a realistic model, it provides a useful
exactly solvable case. By following the basic steps of the rate equation
approach, we find the joint distribution

$$
n_{ij} = p^2 q^{j-1}\,\frac{\Gamma(i+j)}{2^{\,i+j}\,\Gamma(i+1)\,\Gamma(j)},
\tag{22}
$$



from which we deduce the in- and out-degree distributions:
$I_i = p^2/(1+p)^{i+1}$ and $O_j = p^2 q^{j-1}$. Again, the in- and
out-degrees of a node are correlated.
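
The marginals quoted for the constant-rate model can be checked by direct summation. In the sketch below the joint distribution is taken as $n_{ij} = p^2 q^{j-1}\binom{i+j-1}{i}/2^{i+j}$; the prefactor $p^2 q^{j-1}$ is an assumption of this example, fixed by requiring that the stated marginals are reproduced. The function name and the cutoffs are illustrative.

```python
from math import comb

def joint_constant(i, j, p):
    """Joint density for constant rates (prefactor p^2 q^(j-1) assumed, see lead-in)."""
    q = 1 - p
    return p**2 * q**(j - 1) * comb(i + j - 1, i) / 2**(i + j)

p = 0.3
q = 1 - p
cutoff = 400   # the sums converge geometrically, so a modest cutoff suffices

out_marginal = sum(joint_constant(i, 3, p) for i in range(cutoff))      # O_3
in_marginal = sum(joint_constant(2, j, p) for j in range(1, cutoff))    # I_2
print(out_marginal, p**2 * q**2)
print(in_marginal, p**2 / (1 + p)**3)

# Correlation check: independent degrees would give n_ij = I_i * O_j / p.
independent = (p**2 / (1 + p)) * (p**2) / p      # I_0 * O_1 / p
print(joint_constant(0, 1, p), independent)
```

The last line illustrates the correlation: $n_{01} = p^2/2$ differs from the factorized value $p^3/(1+p)$ for any $p < 1$.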

In summary, we have studied a growing network model 
which incorporates: (i) node creation and immediate at- 
tachment to a pre-existing node, and (ii) link creation 
between pre-existing nodes. The combination of these 
two processes naturally leads to non-trivial in-degree and 
out-degree distributions. We computed many structural 
properties of the resulting network by solving the rate 
equations for the evolution of the number of nodes with 
given in- and out-degree. For link attachment rate lin- 
ear in the target node degree and also link creation rate 
linear in the degrees of the two end nodes, power-law in- 
and out-degree distributions are dynamically generated. 
By choosing the parameters of the growth rates in a nat- 
ural manner these exponents can be brought into accord 
with recent measurements of the web. Within this class
of models, the linear-bilinear growth rates appear to be
a viable candidate for describing the link structure of the 
web graph. The model also predicts power-law behav- 
ior when e.g., the in-degree is fixed and the out-degree 
varies. Significant correlations between the in- and out- 
degrees of a node develop spontaneously, in agreement 
with everyday experience. Quantitative measurements of 
correlations in the web graph would test our model and 
help construct a more realistic model of the world-wide 
web. 

We are grateful for financial support of this work from 
NSF grant DMR9978902 and ARO grant DAAD19-99-1- 
0173 (PLK and SR), and a grant from the EPSRC (GJR). 



[1] B. A. Huberman, P. L. T. Pirolli, J. E. Pitkow, and R. Lukose, Science
    280, 95 (1998); S. M. Maurer and B. A. Huberman, nlin.CD/0003041.
[2] J. Kleinberg, R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins, in:
    Proceedings of the International Conference on Combinatorics and
    Computing, Lecture Notes in Computer Science, Vol. 1627
    (Springer-Verlag, Berlin, 1999).
[3] S. R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins, in:
    Proceedings of the 25th Very Large Databases Conference, Edinburgh,
    Scotland, 1999 (Morgan Kaufmann, Orlando, FL, 1999).
[4] A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata,
    A. Tomkins, and J. Wiener, Computer Networks 33, 309 (2000).
[5] A. L. Barabasi and R. Albert, Science 286, 509 (1999).
[6] S. N. Dorogovtsev and J. F. F. Mendes, Phys. Rev. E 62, 1842 (2000).
[7] P. L. Krapivsky, S. Redner, and F. Leyvraz, Phys. Rev. Lett. 85, 4629
    (2000).
[8] S. N. Dorogovtsev, J. F. F. Mendes, and A. N. Samukhin, Phys. Rev. Lett.
    85, 4633 (2000).
[9] M. E. J. Newman, S. H. Strogatz, and D. J. Watts, cond-mat/0007235.
[10] G. Bianconi and A. L. Barabasi, cond-mat/0011029.
[11] P. L. Krapivsky and S. Redner, Phys. Rev. E 63, xxxx (2001)
     [cond-mat/0011094].
[12] B. Tadic, Physica A 293, 273 (2001).
[13] The earliest network model was proposed by H. A. Simon, Biometrika 42,
     425 (1955) to describe word frequency; see also S. Bornholdt and
     H. Ebel, cond-mat/0008465.
[14] R. Albert and A.-L. Barabasi, Phys. Rev. Lett. 85, 5234 (2000);
     S. N. Dorogovtsev and J. F. F. Mendes, Europhys. Lett. 52, 33 (2000).
[15] B. Bollobas, Random Graphs (Academic Press, London, 1985); S. Janson,
     T. Luczak, and A. Rucinski, Random Graphs (Wiley, New York, 2000).






